Database system and a method of data retrieval from the system

ABSTRACT

Since metadata pertaining to real data stored in at least one database (DB) are collected and managed at a single meta DB server, and metadata that match a retrieval request are extracted by search of the meta DB server, even when a plurality of DBs and DB servers for managing DBs are present on a network, all metadata that match the retrieval request can be extracted independently of which DBs metadata pertain to. Hence, all data that match a retrieval request can be obtained from a single server independently of the actual locations of the distributed DBs and DB servers.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a database system, data retrieval method, and storage medium and, more particularly, to a technique suitably used in a retrieval system that finds out desired data from a plurality of distributed databases.

2. Description of the Related Art

As the performance of computers has increased in recent years, large-scale computers such as single mainframes have been replaced by distributed systems built from a plurality of workstations or personal computers. A distributed system makes development and maintenance of the system relatively easy. As an example of a distributed system, the so-called Internet is known.

In the Internet, a plurality of computers are distributed worldwide as servers or clients, and construct a single, huge database (to be abbreviated as a DB hereinafter). Text information, image information, and the like are registered in these DBs or are read out therefrom using some protocols. Not only in the Internet, but also in any system that deals with a huge volume of data, DBs tend to be distributed.

When desired information is read out from such distributed DBs, it takes a great deal of time and labor to search all servers that manage these DBs for the required information. More specifically, since the user does not know the location of the information to be read out in the distributed DBs, he or she must access the servers allocated in correspondence with these DBs in turn and must repeat the search until he or she finds the desired information.

It is impossible to retrieve required information from all the servers unless the user knows the locations (address information such as URL: Uniform Resource Locator) of all DB servers. However, the distributed DB servers constantly register or delete data, and each DB server itself is constantly connected to or disconnected from the network. Hence, it is very hard for the user to recognize all these facts and to retrieve accurate information.

In order to eliminate such inconvenience, address retrieval services called search engines are available in, e.g., the Internet. Each search engine collects URL information automatically or manually, and a required URL can be retrieved by inputting, e.g., a keyword. For example, if a search using a keyword “patent” is made, the URLs of servers relevant to “patent” are output.

However, the search engine can only retrieve the URL information of a DB server, but cannot search an RDBMS (relational DB management system) built in the server at that retrieved URL. Therefore, in order to search an RDBMS or the like, the user retrieves information of a desired server from the search engine, and then connects to the desired server on the basis of the retrieval result. Then, the user searches the DB for his or her required information using a DB retrieval method corresponding to that server.

In this way, conventionally, when DBs that store various kinds of data are distributed, acquiring desired data requires much time and labor.

Furthermore, in the RDBMS, the maximum number of columns that can be held in one table is normally limited. Hence, in an RDBMS whose maximum number of columns is limited to 256, when a table having 257 or more columns is to be created, a plurality of tables (real tables) each including 256 columns or fewer are generated, and are related so that the database appears to be built as a single table (view).

For example, single view X shown in FIG. 1 is made up of three real tables A, B, and C, which are related. More specifically, identical data is stored in key columns a1, b1, and c1 on real tables A, B, and C, and column x1 of view X is formed using these columns a1, b1, and c1 as joint keys, thus maintaining consistency among the three independent tables. That is, column x1 on view X is common to three columns a1, b1, and c1.

Also, columns a2, a3, and a4 on real table A correspond to columns x2, x3, and x4 on view X, columns b2, b3, b4, and b5 on real table B to columns x5, x6, x7, and x8 on view X, and columns c2 and c3 on real table C to columns x8 and x9 on view X, respectively. Paying attention to column x8 on view X, two columns, i.e., column b5 on real table B and column c2 on real table C, are related to this column. In other words, these columns b5 and c2 on real tables B and C store identical data.

An SQL statement for creating single view X from three real tables A, B, and C is as follows:

    create view viewX (x1, x2, x3, x4, x5, x6, x7, x8, x9)
    as select a1, a2, a3, a4, b2, b3, b4, b5, c3 from TableA, TableB, TableC
    where a1=b1 and a1=c1 and b5=c2

However, when such a DB having a plurality of real tables A, B, and C is searched for given data, the following problem is posed. That is, in a conventional DB system, since a search is made by calling all the related real tables, all real tables A, B, and C are searched irrespective of the real table in which the desired data is located, and the individual real tables are searched in turn in accordance with a search formula input by the user.

Assuming that data to be retrieved pertains to columns x8 and x9 on view X, since column x8 on view X has data common to columns b5 and c2 on real tables B and C, the actual search can be completed using only real table C, which corresponds to both columns x8 and x9, without using real table B. Since columns x8 and x9 on view X correspond to none of the columns on real table A, there is no need to search real table A in practice.
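The observation above can be checked with a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the description of FIG. 1; the row values are invented for illustration, and it is assumed that every key appears in all three real tables (otherwise the join would drop rows and the two results could differ).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Real tables A, B, and C as described for FIG. 1 (sample rows are made up).
cur.execute("CREATE TABLE TableA (a1, a2, a3, a4)")
cur.execute("CREATE TABLE TableB (b1, b2, b3, b4, b5)")
cur.execute("CREATE TABLE TableC (c1, c2, c3)")
for key in (1, 2, 3):
    shared = f"s{key}"  # value stored identically in b5 and c2
    cur.execute("INSERT INTO TableA VALUES (?, ?, ?, ?)",
                (key, f"a2-{key}", f"a3-{key}", f"a4-{key}"))
    cur.execute("INSERT INTO TableB VALUES (?, ?, ?, ?, ?)",
                (key, f"b2-{key}", f"b3-{key}", f"b4-{key}", shared))
    cur.execute("INSERT INTO TableC VALUES (?, ?, ?)",
                (key, shared, f"c3-{key}"))

# View X joined over all three real tables, as in the statement for view X.
cur.execute("""
    CREATE VIEW viewX (x1, x2, x3, x4, x5, x6, x7, x8, x9) AS
    SELECT a1, a2, a3, a4, b2, b3, b4, b5, c3
    FROM TableA, TableB, TableC
    WHERE a1 = b1 AND a1 = c1 AND b5 = c2
""")

# Retrieving x8 and x9 through the view joins A, B, and C ...
via_view = cur.execute("SELECT x8, x9 FROM viewX ORDER BY x8").fetchall()
# ... while real table C alone already holds the same data in c2 and c3.
via_table_c = cur.execute("SELECT c2, c3 FROM TableC ORDER BY c2").fetchall()
```

Under these assumptions both queries return the same rows, which is why joining real tables A and B adds cost without adding information for this request.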

More specifically, in the conventional DB system, a broader range than necessary is searched because more real tables are joined than required. Such processing prolongs the DB search time and consumes more memory of the computer that forms the system than necessary, resulting in low search performance.

When the user searches the DB, all the real tables must be joined. However, since the number of columns is also limited on a view provided by an RDBMS as in a real table, a long view cannot be formed beyond the physical limitation. Therefore, upon observing the contents of a view beyond the physical limitation, the contents must be presented to the user in units of real tables or by preparing a customized application program which manages data in units of real tables.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a mechanism that allows the user to easily search a DB system built by distributed DBs and their servers without requiring immediate connectivity to the distributed DB servers.

It is another object of the present invention to provide a mechanism which always recognizes information pertaining to each DB stored in distributed servers, and allows the user to retrieve the latest information at the time of search as a result without exerting any extra load on the user.

It is still another object of the present invention to provide a database system which can join tables at high speed with a minimum required memory capacity in a relational database system.

It is still another object of the present invention to create a long view beyond the physical limitation on a database.

In order to achieve the above objects, a database system built by distributing one or more databases and one or more first servers which search the databases for real data on a network, comprises metadata management means for collecting metadata which pertain to real data stored in the one or more databases from the one or more first servers, and managing the collected metadata, and metadata retrieval means for extracting metadata which matches a request from a user terminal connected to the network by search in response to the request.

Note that the metadata management means and metadata retrieval means may be located in one or more second servers different from the first servers.

According to another feature of the present invention, the metadata contains at least information indicating a location of the database or the first server, and contents of real data in the database.

According to still another feature of the present invention, the user terminal comprises means for inputting a retrieval request of the metadata, means for inputting a retrieval condition upon retrieving real data on the database using a retrieval result of the metadata supplied from the metadata retrieval means, and means for transferring the input retrieval condition to the first server indicated by the extracted metadata as a retrieval request.

According to still another feature of the present invention, the system further comprises means for providing a form for inputting the retrieval condition, which form can be commonly used irrespective of the retrieval result of the metadata retrieval means.

According to still another feature of the present invention, the first server comprises means for converting the retrieval request to the database transferred from the user terminal into a format concordant with the database to be accessed.

According to still another feature of the present invention, the first server comprises metadata saving means for creating and saving metadata that pertains to the database managed by that first server, and the second server comprises means for acquiring corresponding metadata when data stored in the metadata saving means has been updated.

Note that the second server may comprise means for acquiring data stored in the metadata saving means at a predetermined time interval.

A data retrieval method according to the present invention comprises the steps of: collecting metadata that pertain to real data stored in databases distributed on a network by a second server via first servers distributed on the network, and saving the collected metadata; extracting metadata that matches a request by search of the collected metadata; inputting a retrieval condition for the database on the basis of a retrieval result of the metadata; issuing a retrieval request of the real data to the first server indicated by the extracted metadata; and retrieving, by the first server, the real data from the corresponding database in accordance with the retrieval request.

A recording medium according to the present invention records a program for making a computer implement, in a database system built by distributing on a network one or more user terminals, one or more databases, one or more first servers for searching the databases for real data, and one or more second servers for collecting metadata which pertain to real data stored in the one or more databases from the one or more first servers and managing the collected metadata: a function of collecting the metadata of the distributed databases by the second server via the first servers; a function of extracting metadata which matches a retrieval request from a user by search of the collected metadata in response to the retrieval request; a function of inputting a retrieval condition for the databases at the user terminal on the basis of a retrieval result of the metadata, and issuing a retrieval request of the real data to the first server indicated by the extracted metadata; and a function of retrieving the real data by the first server in accordance with the retrieval request.

Since the present invention is comprised of the aforementioned technical means, when a retrieval request is issued to the second server that collects and manages metadata pertaining to one or more databases, all metadata that match the request are retrieved and presented to the user. Even when a plurality of databases and first servers that manage these databases are present on a network, when a retrieval request is issued to the second server, all metadata that match the retrieval request are retrieved independently of which databases metadata pertain to. For this reason, the user can obtain all data that match his or her retrieval request from one server as long as he or she knows only the location of the second server, even if he or she does not know the locations of the distributed databases or the first servers. In this way, the second server comprises means for providing metadata to other computers, and also provides a gateway for search common to all the databases by integrating the schemata of databases on the network.

According to another feature of the present invention, since metadata contains the location of at least the database or first server, and information that represents the contents of real data in the database, the user can detect the location of the database which stores real data that matches the retrieval request, or the first server that manages the database simultaneously with retrieval of metadata. Consequently, the user need only know the location of the second server, and need not be aware of the locations of the distributed databases and first servers for managing these databases.

According to still another feature of the present invention, since the user terminal comprises means for inputting a retrieval condition upon searching for real data using the retrieval result of metadata, and means for transferring the input retrieval condition to the first server as a retrieval request, the user, who has detected the location of the database that stores desired real data on the basis of the retrieval result of the metadata, can search for real data by inputting the retrieval condition.

According to still another feature of the present invention, since the first server comprises means for converting the retrieval request for the database transferred from the user terminal into a format concordant with the database to be accessed, the user need only create and issue a retrieval condition that fulfills a standard format without taking notice of the sites of individual distributed databases, thus building a multidatabase system of a plurality of different kinds of distributed databases.

According to still another feature of the present invention, since the second server comprises the function of retrieving metadata when the first server has updated metadata or at given time intervals, metadata collected at the second server allow the user to always obtain the latest information at the time of search as a result, thus flexibly coping with changes in the system.

In another aspect of a database system according to the present invention, a database system, which searches a plurality of tables joined by a relational database, comprises table extraction means for extracting one table including columns that store data to be retrieved from a plurality of tables, and column exclusion means for excluding columns of the table extracted by the table extraction means and columns on other tables which store the same data contents as data contents of the columns on the extracted table from columns to be extracted in subsequent processing, and the tables extracted in turn by the table extraction means are joined when the processing of the table extraction means and the processing of the column exclusion means have been repeated until all the columns including data to be retrieved are analyzed.

According to another feature of the present invention, the table extraction means extracts one table including a largest number of columns which store data to be retrieved from the plurality of tables.

According to still another feature of the present invention, the system further comprises metadata management means for collecting and managing metadata which pertain to joining of the plurality of tables, and wherein the table extraction means extracts the table on the basis of the metadata stored in the metadata management means.

According to still another feature of the present invention, the system further comprises retrieval means for retrieving objects in accordance with a retrieval key, and data is retrieved from the tables which are extracted in turn and joined by the table extraction means.

In another aspect of a data retrieval method of the present invention, a method of data retrieval from a database, processing for extracting a table and processing for excluding columns including identical data upon search by joining a plurality of tables by a relational database are repeated in such a manner that one table including columns that store data to be retrieved is extracted from the plurality of tables, columns which store the same data contents as data contents of columns on the extracted table of other tables are excluded, and another table is extracted from the remaining tables, and one or more tables extracted in turn at that time are joined.

According to another feature of the present invention, upon extracting one table from the plurality of tables, one table including a largest number of columns that store data to be retrieved is extracted.

According to still another feature of the present invention, data is retrieved from the one or more joined tables.

Another aspect of a recording medium of the present invention records a program for making a computer implement: means for extracting one table including columns that store data to be retrieved from a plurality of tables upon search by joining a plurality of tables by a relational database; means for excluding columns of the extracted table and columns on other tables which store the same data contents as data contents of the columns on the extracted table from columns to be extracted in subsequent processing; and means for joining the tables extracted in turn when the processing of the two means have been repeated until all the columns including data to be retrieved are analyzed.

According to another feature of the present invention, the program makes the computer further implement retrieval means for retrieving objects in accordance with a retrieval key from the tables extracted and joined by the table extraction means.

Since the present invention is comprised of the aforementioned technical means, when columns with identical data contents are present in different tables, those columns are handled during the process as belonging to only one of the tables (e.g., the table having the largest number of columns to be retrieved), so that not every table containing identical columns needs to be joined. Also, a table that includes no columns to be retrieved is not extracted as a table to be joined. In this manner, an unnecessarily large number of tables can be prevented from being joined.
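The repeated table extraction and column exclusion described above can be sketched as a greedy selection. This is only an illustrative reading of the text, not the claimed implementation: the dictionary layout, the function name select_tables, and the alphabetical tie-breaking rule are assumptions, and the mapping reproduces the FIG. 1 example (view X over real tables A, B, and C). Every requested view column is assumed to be stored in at least one real table.

```python
# Hypothetical mapping: view column -> {real table: real column storing it}.
VIEW_X = {
    "x1": {"A": "a1", "B": "b1", "C": "c1"},
    "x2": {"A": "a2"},
    "x3": {"A": "a3"},
    "x4": {"A": "a4"},
    "x5": {"B": "b2"},
    "x6": {"B": "b3"},
    "x7": {"B": "b4"},
    "x8": {"B": "b5", "C": "c2"},
    "x9": {"C": "c3"},
}


def select_tables(view_map, wanted):
    """Return the real tables to join, in the order they are extracted."""
    remaining = set(wanted)
    chosen = []
    while remaining:
        # Count, per table, how many still-needed columns it stores.
        counts = {}
        for col in remaining:
            for table in view_map[col]:
                counts[table] = counts.get(table, 0) + 1
        # Table extraction: take the table with the largest count
        # (ties broken alphabetically; this rule is an assumption).
        best = min(counts, key=lambda t: (-counts[t], t))
        chosen.append(best)
        # Column exclusion: drop every column the extracted table stores,
        # including columns duplicated on other tables.
        remaining = {c for c in remaining if best not in view_map[c]}
    return chosen


tables_for_x8_x9 = select_tables(VIEW_X, ["x8", "x9"])
tables_for_x2_x8_x9 = select_tables(VIEW_X, ["x2", "x8", "x9"])
```

For a request on x8 and x9 the sketch selects only real table C, and for x2, x8, and x9 it selects C and then A, so real table B is never joined in either case.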

According to another feature of the present invention, since a plurality of tables are not independently managed but are systematically managed by collecting metadata pertaining to table joining, the user can see the plurality of tables as one table although they are merely joined ones when viewed from the database system. That is, a long view beyond the physical limitations, e.g., the number of columns, of a database can be formed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a view for explaining the relationship among real tables;

FIG. 2 is a schematic block diagram showing the arrangement of a database system according to the first embodiment of the present invention;

FIG. 3 is a block diagram showing software and hardware images of the database system according to the first embodiment;

FIG. 4 is a flow chart showing the flow of the overall processing of the database system according to the first embodiment;

FIG. 5 is a flow chart showing the flow of meta DB update processing in the database system according to the first embodiment;

FIG. 6 is a flow chart showing a series of retrieval processes of the database system according to the first embodiment in correspondence with a user terminal, DB server, and meta DB server;

FIG. 7 shows an example of the directory structure (layer structure) of metadata;

FIG. 8 shows an example of description of a metadata file pertaining to a DB;

FIG. 9 is a block diagram showing an example of the arrangement of a database system according to the second embodiment of the present invention;

FIG. 10 is a view for explaining the retrieval operation of the database system according to the second embodiment (a data retrieval method according to this embodiment);

FIG. 11 is a view for explaining the retrieval operation of the database system according to the second embodiment (a data retrieval method according to this embodiment);

FIG. 12 is a view for explaining the retrieval operation of the database system according to the second embodiment (a data retrieval method according to this embodiment); and

FIG. 13 is a view for explaining the retrieval operation of the database system according to the second embodiment (a data retrieval method according to this embodiment).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

First Embodiment

FIG. 2 is a schematic block diagram showing the arrangement of a database system according to the first embodiment, and FIG. 3 is a block diagram showing software and hardware images of the database search system.

Referring to FIG. 2, reference numeral 10 denotes a user terminal which is installed with a WWW (World Wide Web) browser 11, and which inputs a keyword for search, issues a retrieval request, displays a retrieval result, and so forth on the WWW browser 11. A search for a meta DB (to be described later) or for real data is designated on this WWW browser 11. The WWW browser 11 comprises a GUI module 11 a and retrieval request transfer module 11 b, as shown in FIG. 3, allows the user to make various operations for search using the GUI module 11 a, and transfers a retrieval request signal using the retrieval request transfer module 11 b.

Reference numeral 20 denotes a database (DB) which stores actual data. At least one DB 20 is present on the network. A platform or DB application used for searching the DB 20 is not particularly limited. For example, data retrieval can be done by search means such as SQL (Structured Query Language: a database language for a relational DB having a table style data structure).

Reference numeral 30 denotes a DB server which is installed with an HTTPD (Hyper Text Transfer Protocol Daemon) 31 comprising a retrieval request receiving module 31 a and metadata providing module 31 b, and a retrieval executing module 32 made up of a DB network client, as shown in FIG. 3. Basically, one DB server 30 is provided to one DB 20, but one DB server 30 may manage a plurality of DBs 20.

The DB server 30 searches the DB 20 in response to a retrieval request from the user terminal 10, and sends back the retrieval result to the user terminal 10 in the HTML (Hyper Text Markup Language) format. In this case, the DB server 30 translates the retrieval request from the user terminal 10 into a format concordant with the DB 20 to absorb any discrepancies among different kinds of DBs. The DB server 30 also creates and stores metadata 33 pertaining to real data stored in the DB 20 managed by itself.

Note that metadata is information for managing data such as the attributes, semantic contents, acquisition sources, storage locations, and the like of data stored in the DB 20. In this embodiment, especially, the metadata contains at least data contents indicating what kinds of data the DB 20 stores, and a URL used for accessing that DB 20. The DB server 30 provides the metadata 33 to a meta DB server 40 (to be described later) in response to a request from the meta DB server 40.

FIG. 7 shows an example of the directory structure of the metadata 33. FIG. 7 shows an example wherein one DB server 30 manages a plurality of DBs 20 (DB1, DB2, DB3, . . . ). As shown in FIG. 7, the directory of each DB contains a metadata file pertaining to that DB, a metadata file pertaining to tables that form the DB, and a metadata file pertaining to columns 1 to n in the tables.

FIG. 8 shows an example of the contents of the metadata file concerning the DB. As shown in FIG. 8, metadata for the DB include information DBKEY, DBEXPL, DBMS, DBLIMIT, SERIAL, CHECK, RETRY, EMAIL, TBLFILE, DBNAME, and SQLURL. Such information will be explained in turn below.

DBKEY indicates the keyword for the DB, and describes a keyword “demo, member” in FIG. 8. DBEXPL indicates the content comment of the DB, and describes a comment “member information” in FIG. 8. DBMS indicates the name of the DB system, which describes the name “OracleWorkgroupServer—7.3” in FIG. 8. DBLIMIT indicates the access limitations to the DB, and describes “allow@foo.co.jp;deny all” in FIG. 8. That is, DBLIMIT describes that accesses by users having addresses containing “@foo.co.jp” are granted, but accesses by other users are denied.
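How a DBLIMIT value such as the one above might be evaluated can be sketched as follows. The text gives only the example string and its meaning, so the rule grammar used here is an assumption: rules are separated by semicolons, each starts with "allow" or "deny", the first matching rule wins, the pattern "all" matches any address, and access is denied when no rule matches. The function name is hypothetical.

```python
def is_access_allowed(dblimit, user_address):
    """Evaluate an access limitation string against a user's address."""
    for rule in dblimit.split(";"):
        rule = rule.strip()
        for action in ("allow", "deny"):
            if rule.startswith(action):
                pattern = rule[len(action):].strip()
                # "all" matches everyone; otherwise the address must
                # contain the pattern, as the description of FIG. 8 states.
                if pattern == "all" or pattern in user_address:
                    return action == "allow"
    return False  # assumption: deny when no rule matches


FIG8_DBLIMIT = "allow@foo.co.jp;deny all"
inside = is_access_allowed(FIG8_DBLIMIT, "alice@foo.co.jp")
outside = is_access_allowed(FIG8_DBLIMIT, "bob@example.com")
```

With the FIG. 8 value, an address containing "@foo.co.jp" is granted access and any other address falls through to "deny all".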

SERIAL indicates the serial number of metadata used for update checking, and describes “19971225000000”, i.e., indicates that metadata have been updated on Dec. 25, 1997, in FIG. 8. CHECK indicates the time interval [sec] for update checking of metadata, and describes the time interval “3,600 sec” in FIG. 8. RETRY indicates the retry time interval [sec] when metadata update checking fails, and describes the time interval “600 sec” in FIG. 8.
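A minimal sketch of how SERIAL, CHECK, and RETRY could be used together is given below. It assumes that the 14-digit SERIAL is a timestamp in year, month, day, hour, minute, second order (the text only confirms the date part), and the helper names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_serial(serial):
    """Interpret a SERIAL value as a timestamp (assumed layout YYYYMMDDhhmmss)."""
    return datetime.strptime(serial, "%Y%m%d%H%M%S")


def next_check(last_attempt, succeeded, check_sec=3600, retry_sec=600):
    """Time of the next update check: CHECK after a success, RETRY after a failure."""
    delay = check_sec if succeeded else retry_sec
    return last_attempt + timedelta(seconds=delay)


updated_at = parse_serial("19971225000000")
after_success = next_check(updated_at, succeeded=True)
after_failure = next_check(updated_at, succeeded=False)
```

With the FIG. 8 values, a successful check is followed by the next check one hour later, whereas a failed check is retried after ten minutes.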

EMAIL indicates the mail address of the administrator of the DB, and describes the address “dbnavi@foo.co.jp” in FIG. 8. TBLFILE indicates the file name that describes metadata pertaining to the tables of the DB, and describes the name “tables.html” in FIG. 8. DBNAME indicates the registered name of the DB, and describes the name “demo@navi.foo.co.jp” in FIG. 8. SQLURL indicates the URL of the retrieval execution request destination to the DB, and describes the URL “http://navi.foo.co.jp:8080/servlet/DBNAVI/service” in FIG. 8.

Although not shown, the metadata file of the tables and those for the columns describe a plurality of metadata having the same format as that of the metadata file of the DB shown in FIG. 8 but different contents. The metadata for the tables include information COLFILE, TBLNAME, DBNAME, TBLOWNER, TBLNAME2, TBLKEY, TBLEXPL, TBLDATE, TBLMODIFY, NCOLS, NROWS, and TBLLIMIT. Such information will be described in turn below.

COLFILE indicates the file name that describes metadata pertaining to the columns, TBLNAME indicates the table name, and DBNAME indicates the DB name, which are the same as those described in the metadata file for the DB shown in FIG. 8. TBLOWNER indicates the owner name of the table, TBLNAME2 indicates the table name in Japanese, TBLKEY indicates the keyword pertaining to the tables, TBLEXPL indicates the content comments of the tables, TBLDATE indicates the creation date of the tables, TBLMODIFY indicates the update timing of the tables, NCOLS indicates the number of columns in the tables, NROWS indicates the number of rows in the tables, and TBLLIMIT indicates an access limitation to the tables.

Also, metadata for the columns include information COLNAME, DBNAME, TBLNAME, TBLOWNER, COLNAME2, CODEFILE, COLKEY, COLEXPL, COLHANDLER, COLHINFO, COLATTR, COLTYPE, COLUNIT, and COLSIZE. Such information will be explained in turn below.

COLNAME indicates the column name. DBNAME indicates the DB name, which is the same as that in the metadata file for the DB. TBLNAME indicates the table name, and TBLOWNER indicates the owner name of the table, which are the same as those in the metadata file for each table. COLNAME2 is the column name in Japanese, and CODEFILE indicates the storage location of a metadata file pertaining to a code which indicates the contents of the column.

COLKEY indicates the keyword pertaining to the column, COLEXPL indicates the content comment of the column, COLHANDLER indicates the method of handling the column, COLHINFO indicates the handling information of the column, COLATTR indicates the limitation attribute on the column, COLTYPE indicates the type of data stored in the column, COLUNIT indicates the unit of data stored in the column, and COLSIZE indicates the size of data stored in the column.

Furthermore, the metadata for the code include information CODENAME, DBNAME, CODENAME2, CODEKEY, and CODEEXPL. CODENAME indicates the code name, and DBNAME indicates the DB name, which is the same as that in the metadata file for the DB. CODENAME2 indicates the code name in Japanese, CODEKEY indicates the keyword pertaining to the code, and CODEEXPL indicates the content comments of the code.

Referring back to FIGS. 2 and 3, the meta DB server 40 collects the metadata 33 of the DBs 20 from at least one DB server 30 present on the network, and creates a meta DB 41. The meta DB 41 is created and managed by a meta DB management module 42 that consists of a DB management program. The metadata 33 on the DB server 30 are collected periodically in response to a request from the meta DB management module 42, or upon updating of the DB server 30 or the metadata 33.
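The collection step can be sketched as follows, under stated assumptions. The callable fetch stands in for the request that the meta DB server 40 would send to the metadata providing module 31 b of each DB server 30; it, the in-memory dictionaries, and the URL "http://dbserver.example/" are all hypothetical. An entry is refreshed when the SERIAL offered by the DB server is newer than the stored one; comparing the strings directly works because SERIAL values of equal length sort chronologically.

```python
def collect_metadata(meta_db, fetch, server_urls):
    """Refresh meta_db in place; return the names of the DBs whose entry changed."""
    updated = []
    for url in server_urls:
        for record in fetch(url):
            name = record["DBNAME"]
            known = meta_db.get(name)
            if known is None or record["SERIAL"] > known["SERIAL"]:
                # Store a copy together with the server it came from.
                meta_db[name] = dict(record, SERVER=url)
                updated.append(name)
    return updated


# Stand-in for the network: one DB server offering the FIG. 8 example record.
servers = {
    "http://dbserver.example/": [
        {"DBNAME": "demo@navi.foo.co.jp", "SERIAL": "19971225000000"},
    ],
}
meta_db = {}
first_pass = collect_metadata(meta_db, servers.get, servers)
second_pass = collect_metadata(meta_db, servers.get, servers)  # nothing changed

# The DB server updates its metadata, so the next pass picks it up.
servers["http://dbserver.example/"][0]["SERIAL"] = "19980101000000"
third_pass = collect_metadata(meta_db, servers.get, servers)
```

Running such a pass at the CHECK interval would correspond to the periodic collection, while triggering it from the DB server side would correspond to collection upon update.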

The meta DB server 40 is installed with an HTTPD 43 comprising a meta DB retrieval receiving module 43 a and retrieval request creation module 43 b. When the user terminal 10 issues a retrieval request to the meta DB 41, the meta DB retrieval receiving module 43 a receives that request to search for corresponding metadata, and sends the retrieval result to the user terminal 10 in the HTML format. The retrieval request creation module 43 b creates a retrieval condition to be issued to the DB server 30 to search the DB 20 for real data in response to an instruction from the user who responded upon checking the retrieval result of the meta DB 41.

The user terminal 10, DB 20, DB server 30, and meta DB server 40 with the aforementioned arrangements are connected in practice to a network 50, as shown in FIG. 3. In FIG. 3, these apparatuses 10 to 40 are connected one each to the network 50, but a plurality of apparatuses may be respectively connected. In a distributed DB system, a plurality of apparatuses are normally present.

More specifically, one or more user terminals 10 are present on the network 50. One or more DB servers 30 and one or more meta DB servers 40 are visible to each user terminal 10, but the DBs 20 are invisible to the user terminal 10.

As described above, one or more DBs 20 are present on the network 50. Physically, the DB 20 may be mounted in the same apparatus as the DB server 30. The number of DB servers 30 is not limited when viewed from each DB 20.

Also, one or more DB servers 30 are present on the network 50. The number of DBs 20 is not limited when viewed from each DB server 30. Note that the conventional DB server has neither the metadata providing module 31 b nor metadata 33.

Furthermore, one or more meta DB servers 40 are present on the network 50. A plurality of user terminals 10 and DB servers 30 may be visible to each meta DB server 40, but the DBs 20 are invisible to the meta DB server 40.

Conventionally, when desired data is retrieved from one of the DBs 20 distributed on the network 50, the user must connect to one or more DB servers 30 in turn to search the DBs 20. However, in this embodiment, the user connects to the newly added meta DB server 40 first to search for metadata, thus detecting the DB 20 that stores real data to be retrieved. After that, a retrieval condition to be issued to the DB server 30 corresponding to that DB 20 is created and issued from the retrieval request transfer module 11 b to the DB server 30, thus obtaining the retrieval result.

The operation of the database system according to this embodiment with the above arrangement will be explained below.

As preparation prior to search, the administrator of the DB server 30 creates metadata 33 of the DB 20 managed by that server. The created metadata 33 are sent to the meta DB server 40 in response to a request from the meta DB server 40. The meta DB server 40 collects the metadata 33 from the plurality of DB servers 30 to create the meta DB 41.

Upon actual search, the user inputs a desired keyword to search the meta DB 41 of the meta DB server 40 for metadata. With this search, one or more metadata that match the input keyword are extracted from the meta DB 41. Since each metadata contains the URL of the corresponding DB 20 or DB server 30, the user can find out the DB 20 that stores real data to be retrieved at that time, and can select the DB 20 to be accessed. Of course, the user can access all the DBs 20 found by search.

The user who obtained the retrieval result of the metadata creates a retrieval condition to be issued to the DB server 30 to search the DB 20 for real data. A GUI control window used for creating the retrieval condition is provided to the user terminal 10 by the meta DB server 40 on the basis of the retrieval result and the like of metadata. When the user presses a GUI button for starting search after he or she has created the retrieval condition, a retrieval request having that retrieval condition is directly transferred to the DB server 30 without the intervention of the meta DB server 40, thus retrieving real data. The retrieval result is sent back to and displayed on the user terminal 10.

The retrieval result that the meta DB server 40 presents to the user terminal 10 may often simultaneously include a plurality of DB servers 30 or DBs 20. In such case, the GUI control window for creating the retrieval condition provided by the meta DB server 40 is displayed in the predetermined format without exception. Hence, the user can create a retrieval condition irrespective of a plurality of distributed DB servers 30 or DBs 20, and can issue it to a desired DB server 30 in accordance with the URL contained in the metadata retrieval result.

In this manner, according to the database system of this embodiment, when the user initially makes a plausible search using metadata, he or she can extract a plurality of metadata that match an appropriate input keyword from the meta DB 41. The extracted metadata summarize the contents and the like of one or more DBs 20. Hence, the user can easily obtain information of all the DBs 20 that match his or her required keyword, independently of the locations of the distributed DBs 20 (DB servers 30).

More specifically, the user can obtain information of all the different DBs 20 from the single meta DB server 40 by a single search without requiring immediate connectivity to the DBs 20 (DB servers 30). Since the locations (URLs) of the extracted DBs 20 (DB servers 30) are contained in the metadata, which are presented to the user, the user need only know the location (URL) of the meta DB server 40. Therefore, the user need not perform any troublesome operations, e.g., for detecting and accessing all distributed DB servers 30 in turn.

In addition, the latest information (metadata) of each DB 20 present on the network 50 is sent to the meta DB server 40 via periodic communications between the DB server 30 and meta DB server 40. Thus, the latest information at the time of search can be obtained as a retrieval result from the contents of the meta DB 41, thus realizing accurate data retrieval. The metadata in the meta DB 41 are automatically updated between the DB server 30 and meta DB server 40. As a result, the user need not understand the complex processes of registration and deletion of data constantly done by the DB server 30 or connection/disconnection of the DB server 30 itself to the network 50.

Furthermore, in this embodiment, when the metadata 33 has been updated on the DB server 30, the updated metadata is transmitted not actively from the DB server 30 to the meta DB server 40 but in response to a request from the meta DB server 40. With this control, the DB server 30 is completely independent from the meta DB server 40. Hence, the DB server 30 need not know the meta DB server 40 to which it provides metadata. The meta DB server 40 can rewrite the contents of the meta DB 41 on the basis of metadata sent from all the active DB servers 30 in response to a request from itself.

In this embodiment, only one meta DB server 40 is present on the network 50. However, a plurality of meta DB servers 40 may be present. If a mirror server having the same contents as, or different contents from, those of the meta DB server 40 is added, the response of the meta DB server 40 to a request from the user terminal 10 can be prevented from being delayed due to heavy traffic on the network.

The operation of the database system according to this embodiment mentioned above will be described in detail below with reference to the flow charts shown in FIGS. 4 to 6.

Note that FIG. 4 is a flow chart showing the flow of the overall processing, FIG. 5 is a flow chart showing the flow of meta DB update processing (by a meta DB management program), and FIG. 6 is a flow chart showing a series of search processes in correspondence with the user terminal 10, DB server 30, and meta DB server 40.

Referring to FIG. 4, the administrator of the DB server 30 creates and registers metadata 33 that pertain to the DB 20 in step S1. These metadata 33 can be looked up from other machines on the network 50 via the HTTPD 31 on the DB server 30. In this embodiment, in order to accurately and efficiently search the meta DB 41, the metadata 33 can be created in a plurality of layers, as shown in, e.g., FIG. 7.

For example, this embodiment assumes a table style relational database as the DB 20, and uses a four-layered structure, i.e., a DB layer, table layer, column layer, and code layer. The DB layer describes information that pertains to the overall DB 20, the table layer describes information that pertains to tables in the DB 20, the column layer describes information that pertains to columns (items) in the tables, and the code layer describes information that pertains to values in the columns. Each layer includes the contents and keyword of that layer, and at least the DB layer includes the URL of the DB 20 (destination of the retrieval condition).
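The four-layer structure described above can be pictured as nested records. The following is only a minimal sketch: the class names, field names, and the sample values (including the URL) are assumptions made for illustration and do not come from the metadata files of the embodiment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodeMeta:
    """Code layer: describes one value that can appear in a column."""
    value: str
    description: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class ColumnMeta:
    """Column layer: describes one column (item) of a table."""
    name: str
    description: str
    keywords: List[str] = field(default_factory=list)
    codes: List[CodeMeta] = field(default_factory=list)


@dataclass
class TableMeta:
    """Table layer: describes one table in the DB."""
    name: str
    description: str
    keywords: List[str] = field(default_factory=list)
    columns: List[ColumnMeta] = field(default_factory=list)


@dataclass
class DBMeta:
    """DB layer: describes the overall DB and carries its URL."""
    name: str
    description: str
    url: str  # destination of the retrieval condition
    keywords: List[str] = field(default_factory=list)
    tables: List[TableMeta] = field(default_factory=list)


def all_keywords(db: DBMeta) -> List[str]:
    """Collect the keywords of every layer, outermost layer first."""
    found = list(db.keywords)
    for table in db.tables:
        found.extend(table.keywords)
        for column in table.columns:
            found.extend(column.keywords)
            for code in column.codes:
                found.extend(code.keywords)
    return found


# Hypothetical sample data, for illustration only.
example = DBMeta(
    name="sales_db",
    description="Regional sales records",
    url="http://dbserver.example/sales",
    keywords=["sales"],
    tables=[
        TableMeta(
            name="orders",
            description="Order headers",
            keywords=["order"],
            columns=[
                ColumnMeta(
                    name="region",
                    description="Sales region code",
                    keywords=["region"],
                    codes=[CodeMeta("01", "North", ["north"])],
                )
            ],
        )
    ],
)
```

Keeping the URL on the DB layer mirrors the statement that at least the DB layer carries the destination of the retrieval condition; the deeper layers only need contents and keywords.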

In step S2, the administrator of the meta DB server 40 registers information required for acquiring metadata 33 from the DB servers 30, i.e., information of each DB 20 provided by the DB server 30 in a predetermined file in correspondence with DB servers 30 to be supported. An item mandatory for registration is at least the URLs of the DB layer metadata files of the respective DBs 20.

After information of each DB 20 to be supported is registered, the meta DB server 40 acquires metadata 33 pertaining to the registered DBs 20 from the DB servers 30 in step S3, and registers the acquired metadata 33 in the meta DB 41 in step S4. More specifically, the meta DB server 40 collects all metadata 33 from a plurality of registered DB servers 30, and creates the meta DB 41 based on these data by modifying the collected data so as to provide better service to the user.

When the administrator of the meta DB server 40 wants to add a new DB 20 to be supported, he or she need only add one sentence describing information of the corresponding DB 20 in the above-mentioned file. In this way, the meta DB server 40 automatically reads metadata 33 from the corresponding DB server 30 and updates the contents of the meta DB 41. As described above, this embodiment can provide a flexible system which can very easily add or delete DBs 20 to be supported.
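As an illustration of how one registration entry per supported DB 20 might be read, the sketch below parses a hypothetical setup file. The file layout (one line per DB, giving the URL of the DB layer metadata file followed by an optional check interval in seconds), the default interval, and all names are assumptions; the embodiment only requires that the URL of the DB layer metadata file be registered.

```python
from typing import List, NamedTuple

# Hypothetical default used when a line gives no check interval.
DEFAULT_INTERVAL_S = 3600


class SupportedDB(NamedTuple):
    db_layer_url: str       # URL of the DB layer metadata file (mandatory)
    check_interval_s: int   # seconds between metadata collection requests


def parse_setup(text: str) -> List[SupportedDB]:
    """Parse the assumed format: '<url> [interval_seconds]' per line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split()
        interval = int(parts[1]) if len(parts) > 1 else DEFAULT_INTERVAL_S
        entries.append(SupportedDB(parts[0], interval))
    return entries


# Hypothetical file contents, for illustration only.
SETUP_TEXT = """\
# one line per supported DB
http://dbserver-a.example/meta/db.txt 600
http://dbserver-b.example/meta/db.txt
"""
```

Under this assumed layout, supporting a further DB 20 amounts to appending one more line, which matches the single-entry addition described above.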

The preparation prior to search is thus completed. After that, when a search is actually made, the user inquires of the meta DB server 40 using the WWW browser 11 in step S5. Then, the meta DB server 40 searches for DBs 20 that match the user's inquiry using the meta DB 41 in step S6. The meta DB server 40 then forms a retrieval condition creation form page (GUI control window) using the retrieval result and metadata, and sends it to the user terminal 10 in step S7.

The user checks in step S8 if the retrieval result is satisfactory. If the result is not satisfactory, the flow returns to step S5, and the user inputs a keyword different from the previous one to redo search. If the user is satisfied with the retrieval result, the flow advances to step S9. In step S9, the user creates a retrieval condition used for retrieving real data from the extracted DB 20 using the presented retrieval condition creation form page, and issues it as a retrieval request to the DB server 30.

Assuming the above-mentioned relational DB, the location of information to be retrieved (i.e., the table and column locations in the corresponding DB 20) and various other conditions may be input as the retrieval condition. The retrieval condition to be input can be freely determined depending on the application used. When a relational DB is used, the retrieval request to the DB server 30 is created and issued in accordance with the SQL format. The destination of the retrieval request is presented in the form of URL in metadata, and the retrieval request is automatically issued to the destination upon operation of a retrieval execution button.

Upon receiving the retrieval request, the DB server 30 translates the retrieval request into the format that matches the DB 20 in step S10. More specifically, the retrieval condition is created between the user terminal 10 and meta DB server 40 in a predetermined format irrespective of the types of DBs 20 distributed on the network 50. For this reason, when the retrieval request is actually issued to the individual DBs 20, a standard retrieval request is converted to a format that can be interpreted by each DB 20.
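One way to picture the conversion in step S10 is as a mapping from standard names to the names a particular DB 20 actually uses. The sketch below is a simplified illustration under that assumption; the function name, the condition tuple format, the `?` placeholder style, and the sample name map are all hypothetical, and a real DB server 30 may need to translate far more than names.

```python
from typing import Dict, List, Tuple

# A condition in the assumed standard format: (column, operator, value).
Condition = Tuple[str, str, object]

ALLOWED_OPS = {"=", "<", ">", "<=", ">=", "<>"}


def translate(
    table: str,
    columns: List[str],
    conditions: List[Condition],
    name_map: Dict[str, str],
) -> Tuple[str, List[object]]:
    """Rewrite a standard-format request into SQL using DB-specific names.

    name_map maps standard table/column names to the names used by the
    target DB; names missing from the map are passed through unchanged.
    Values are returned separately as parameters rather than inlined.
    """
    def real(name: str) -> str:
        return name_map.get(name, name)

    sql = "SELECT " + ", ".join(real(c) for c in columns) + " FROM " + real(table)
    clauses = []
    params: List[object] = []
    for column, op, value in conditions:
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        clauses.append(f"{real(column)} {op} ?")
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

For example, with a hypothetical map `{"orders": "T_ORD", "region": "RGN_CD", "amount": "AMT"}`, a request for `region` and `amount` from `orders` where `amount > 100` becomes `SELECT RGN_CD, AMT FROM T_ORD WHERE AMT > ?` with the parameter list `[100]`.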

In this way, the user need only create and issue a retrieval condition according to a standard format using the above-mentioned retrieval condition creation form presented by the meta DB server 40 without requiring immediate connectivity to the individual distributed DBs 20. That is, DBs 20 on the network 50 appear equivalent to each other from the user side, and disparate, distributed DBs 20 can be integrated.

Upon reception of the translated retrieval request, the DB server 30 issues a retrieval request (SQL) to the DB 20 in place of the user to retrieve real data in step S11. The retrieval result is sent back to the user terminal 10 and displayed in step S12, thus ending a series of search processes.

The update processing (steps S3 and S4 in FIG. 4) of the meta DB 41 by the meta DB server 40 will be described in detail below with reference to FIG. 5.

Referring to FIG. 5, the URL of each DB server 30 to be supported and its check interval are acquired from a setup file in the meta DB management module 42 in step S21.

The check interval indicates the time interval upon periodically issuing a collection request of metadata 33 from the meta DB server 40 to the DB server 30. The check interval may be set in advance in the meta DB server 40, or the administrator of the DB server 30 may describe the interval in metadata 33 upon creating metadata 33 and transmit the metadata 33 to the meta DB server 40 to store the interval in the setup file.

In step S22, the control waits based on the set check interval until the check timing of the DB server 30 is reached. If the check timing has been reached, the flow advances to step S23 to acquire information of the DB layer of the metadata 33 from the DB server 30. The DB layer contains serial number data, which is counted up every time the contents of the metadata 33 have been updated. By checking this data in step S24, it can be determined if the metadata 33 has been changed.

If the metadata 33 remains the same, the flow returns to step S22 to wait until the next check timing. On the other hand, if the metadata 33 has been changed, the flow advances to step S25 to acquire the information of that metadata 33 from the DB server 30. It is checked in step S26 if the metadata 33 has been updated. If the metadata 33 has already been updated, new metadata 33 is saved, and its update time is recorded in step S27.

It is then checked in step S28 if all the metadata 33 that pertain to the DB 20 which is being checked have been examined. If metadata 33 remain unexamined, the flow returns to step S25 to acquire such metadata. If all the metadata 33 have been examined, it is checked in step S29 if there is another DB server 30 that has reached the check timing. If such server is found, the flow returns to step S23 to repeat the above processing.

In this way, after all the metadata 33 that pertain to the corresponding DBs 20 are acquired from all the DB servers 30 that have reached the check timing, the contents of the meta DB 41 are updated based on the acquired information in step S30. The next check timings of the checked DB servers 30 are recorded in step S31, and the flow then returns to step S22.
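The polling behavior of FIG. 5 can be condensed into a single pass over the registered DB servers 30, as sketched below. This is a simplified illustration rather than the meta DB management program itself: the function and parameter names are assumptions, network access is represented by two caller-supplied functions, the per-file loop of steps S25 to S28 is collapsed into one fetch, and the meta DB 41 is modeled as a plain dictionary.

```python
from typing import Callable, Dict, List


def update_pass(
    intervals: Dict[str, float],             # server URL -> check interval (S21)
    next_check: Dict[str, float],            # server URL -> next check time
    now: float,                              # current time, in seconds
    fetch_serial: Callable[[str], int],      # reads the DB layer serial number (S23)
    fetch_metadata: Callable[[str], dict],   # reads the full metadata (S25)
    meta_db: Dict[str, dict],                # server URL -> stored record
) -> List[str]:
    """Run one polling pass and return the URLs whose metadata was refreshed."""
    refreshed = []
    for url, interval in intervals.items():
        if next_check.get(url, 0.0) > now:
            continue  # check timing not yet reached (S22)
        serial = fetch_serial(url)
        stored = meta_db.get(url)
        # A changed serial number means the metadata was updated (S24).
        if stored is None or stored["serial"] != serial:
            meta_db[url] = {
                "serial": serial,
                "metadata": fetch_metadata(url),
                "updated_at": now,  # update time recorded (S27)
            }
            refreshed.append(url)
        next_check[url] = now + interval  # next check timing recorded (S31)
    return refreshed
```

Because only the serial number is read on every pass, the full metadata is transferred only when something has changed, and the DB server 30 never needs to know which meta DB server 40 is asking.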

A series of search processes (steps S5 to S12 in FIG. 4) executed by the user terminal 10, DB server 30, and meta DB server 40 will be described in detail below with reference to FIG. 6.

Referring to FIG. 6, the user connects the user terminal 10 to the meta DB server 40 via the network in step S41. In response to this connection, the meta DB server 40 sends a search form of the meta DB 41 to the user terminal 10 in the HTML format in step S42.

The user designates a keyword using the above-mentioned search form, and issues a retrieval request to the meta DB server 40 in step S43. In response to the request, the meta DB server 40 executes a full-text search of the meta DB 41 using the input keyword in step S44, and sends back the retrieval result to the user terminal 10 in the HTML format in step S45. The user evaluates the retrieval result in step S46, and if the result is satisfactory, he or she sends a message indicating this to the meta DB server 40.

The meta DB server 40 forms a retrieval condition creation form for the DB 20 in the HTML format using the retrieval result of the meta DB 41 and metadata, and sends it to the user terminal 10 in step S47. The user designates the retrieval condition of the DB 20 using the presented retrieval condition creation form, and sends it to the meta DB server 40 in step S48. The meta DB server 40 sends the received retrieval condition text (SQL) and the URL of the DB server 30 corresponding to the DB 20 as the destination of the text to the user terminal 10 in step S49.

Upon reception of such information, the user terminal 10 issues a retrieval condition statement (SQL) to the DB server 30 indicated by the received URL as a retrieval request in step S50. Upon reception of the retrieval request, the DB server 30 translates the retrieval request into the format concordant with the DB 20 in step S51. The DB server 30 issues a retrieval request (SQL) to the DB 20 to retrieve real data in step S52, and transmits the obtained retrieval result to the user terminal 10 in step S53. The user terminal 10 acquires the retrieval execution result in step S54, thus ending a series of search processes.
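The keyword search of step S44 and the URL hand-off that lets the user terminal 10 contact the DB server 30 directly can be illustrated with a very small sketch. The representation of the meta DB 41 as a dictionary from DB server URL to metadata text, the case-insensitive substring match standing in for the full-text search, and the sample entries are all assumptions made for illustration.

```python
from typing import Dict, List


def search_meta_db(meta_db: Dict[str, str], keyword: str) -> List[str]:
    """Return the URLs of all DB servers whose metadata text contains keyword.

    The match is a case-insensitive substring test, used here as a simple
    stand-in for a full-text search.
    """
    needle = keyword.lower()
    return sorted(url for url, text in meta_db.items() if needle in text.lower())


# Hypothetical meta DB contents, for illustration only.
META_DB = {
    "http://dbserver-a.example/query": "Regional sales records; orders, invoices",
    "http://dbserver-b.example/query": "Weather observations; temperature, rainfall",
    "http://dbserver-c.example/query": "Retail SALES forecasts by store",
}
```

Each returned URL is a destination to which the user terminal 10 can send its retrieval request without further involvement of the meta DB server 40.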

The above-mentioned embodiment has exemplified the distributed DB system built by connecting the user terminal 10, DB 20, DB server 30, and meta DB server 40 to the Internet. However, the network to be connected is not limited to the Internet. For example, this embodiment can be applied to networks other than the Internet such as a WAN (Wide Area Network), LAN (Local Area Network), intranet, and the like.

In the above embodiment, a relational DB has been exemplified as the DB 20. However, the present invention is not limited to such specific DB. Also, the DB can manage various kinds of data contents such as text data, image data, audio data, and the like.

Second Embodiment

The second embodiment of the present invention will be described below with reference to the accompanying drawings.

FIG. 9 is a block diagram showing an example of the arrangement of a database system according to this embodiment. The database system of this embodiment shown in FIG. 9 can be used alone or may be combined with the first embodiment shown in FIG. 2.

The combination with the first embodiment can be implemented when the DB server 30 shown in FIG. 2 comprises functional blocks bounded by the one-dashed chain line in FIG. 9 other than the database 20. On the other hand, the DB server 30 in FIG. 2 may comprise only a search engine 60 and joined table generation means 65 shown in FIG. 9, and the meta DB server 40 may comprise a metadata management means 69 and metadata storage means 70 shown in FIG. 9.

When the database system of this embodiment is combined with the first embodiment, in metadata to be used, other joinable DB names (JOINABLE) are added into the DB layer file, map information (JOINMAP) that describes the correspondence between a plurality of tables represented by a virtual table (to be described later) and columns is added into the table layer file, and a real column name (REALCOL) is added into the column layer file.

The functional blocks shown in FIG. 9 will be explained below. Reference numeral 60 denotes a search engine which searches data in the database 20 on the basis of a retrieval condition requested by an SQL statement. The database 20 stores a plurality of real tables 21, 22, 23, . . . . These real tables 21, 22, 23, . . . are stored in a format that clears the limitations of the RDBMS, and the search engine 60 can search the respective real tables.

Reference numeral 69 denotes a metadata management means which manages metadata that pertain to the plurality of real tables 21, 22, 23, . . . . Note that the metadata is information used for managing data such as the attributes, semantic contents, acquisition sources, storage locations, and the like of data stored in the real tables 21, 22, 23, . . . , and is the same as that described in, e.g., the first embodiment. The metadata is periodically or non-periodically acquired from the database 20, and is stored in a metadata storage means 70. The metadata management means 69 stores the metadata in the metadata storage means 70, and reads out and uses it.

For example, since the database 20 stores the real tables 21, 22, 23, . . . in FIG. 9, the metadata management means 69 acquires the table names of the real tables 21, 22, 23, . . . and the column names of columns that form the tables, and stores them as real table metadata 71 in the metadata storage means 70. Since such information to be stored has a format that can be easily described by text data, the administrator or the like may manually create and store it.

Also, information that pertains to joining of the plurality of tables is managed by the metadata management means 69 as virtual table metadata 72. For example, in case of metadata of a virtual table formed by the two real tables 21 and 22, the table names of the real tables 21 and 22 and all the column names of these tables are stored as metadata. Similarly, in case of a virtual table formed by the three real tables 21, 22, and 23, the three table names and all the column names of these tables are stored.

Reference numeral 65 denotes a joined table generation means, which is comprised of a maximum column number table extraction means 66, selected identical data column exclusion means 67, and table join means 68. The joined table generation means 65 has a function of generating a new virtual table by extracting only required columns from the plurality of real tables 21, 22, 23, . . . on the basis of the metadata managed by the metadata management means 69. In this embodiment, predetermined columns are joined like a conventional view, but the virtual table joined according to this embodiment has a format different from the conventional view.

The maximum column number table extraction means 66 extracts a real table having the maximum number of columns that store data to be retrieved from the real tables 21, 22, 23, . . . on the basis of metadata managed by the metadata management means 69. At this time, when a column having the same data contents as those in a column included in the extracted real table is also present in another real table, that column is handled as the one which belongs to the extracted real table alone.

The selected identical data column exclusion means 67 excludes columns that store the same data contents as those in the columns in the extracted real table from the columns included in the remaining real tables in the subsequent table join processing, every time the maximum column number table extraction means 66 extracts one real table.

The maximum column number table extraction means 66 extracts a real table with the maximum number of columns from the remaining real tables except for those extracted so far again while the corresponding columns are excluded by the selected identical data column exclusion means 67. Likewise, the processing for extracting the real table with the maximum number of columns to be retrieved and the processing for excluding columns that store the same data as those in the extracted real table are repeated.

The table join means 68 joins one or more real tables extracted in turn by the maximum column number table extraction means 66 to create a logically single joined table (virtual table) from one or more real tables. The information of the joined table is managed as joined table metadata 73 by the metadata management means 69.

As described above, in this embodiment, since columns which are included in different real tables but store identical data contents are handled as the ones that belong to one real table, the table join means 68 never joins tables more than required. Also, real tables which do not store any data to be retrieved are not extracted as the tables to be joined.

When the user searches the database, no unnecessary real table is selected as the tables to be joined, and a single joined table can be formed by a minimum required number of real tables. Also, metadata that pertain to joining of the plurality of real tables 21, 22, 23, . . . can be collected and managed by the single metadata management means 69, thus forming a long view beyond the physical limitations on the database. That is, a table which can be seen as one table from the user, and can be seen as a plurality of joined tables from a database management system (DBMS) can be formed.

Referring to FIG. 9, an SQL statement 74 input to the joined table generation means 65 is input by the user to conduct actual search. At this time, the user can search the virtual table managed by the metadata management means 69. More specifically, the user can input a retrieval condition disregarding the limitations on the number of columns. When the SQL statement 74 input by the user includes a command for search within a breadth beyond the limitations on the number of columns of the RDBMS, the joined table generation means 65 performs the aforementioned processing to output a search engine SQL statement 75.

The search engine SQL statement 75 is output from the joined table generation means 65 in the format that allows the search engine 60 to search. The search engine SQL statement 75 searches the joined table obtained by selecting a minimum required number of columns by excluding unnecessary columns. This information is managed by the metadata management means 69. For example, since the joined table is the joined table metadata 73 indicating columns of the real tables that form a joined table, the search engine 60 can perform search using this table.
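To make the relationship between the SQL statement 74 and the search engine SQL statement 75 more concrete, the sketch below builds a SELECT over real tables from a list of requested view columns. It assumes that the choice of one real column per view column has already been made, and it takes the join key of each real table as an input; the text does not specify how the join condition is derived, so the `join_key` parameter, the function name, and the `id` key used in the example are purely hypothetical.

```python
from typing import Dict, List, Tuple


def build_search_sql(
    view_columns: List[str],
    assignment: Dict[str, Tuple[str, str]],  # view column -> (real table, real column)
    join_key: Dict[str, str],                # real table -> key column (assumed)
) -> str:
    """Build a SELECT over the real tables that back the requested view columns."""
    # Collect the real tables in order of first use; only these are joined.
    tables: List[str] = []
    for vcol in view_columns:
        table = assignment[vcol][0]
        if table not in tables:
            tables.append(table)

    select = ", ".join(
        f"{assignment[v][0]}.{assignment[v][1]} AS {v}" for v in view_columns
    )
    sql = f"SELECT {select} FROM {tables[0]}"
    first = tables[0]
    for table in tables[1:]:
        sql += f" JOIN {table} ON {first}.{join_key[first]} = {table}.{join_key[table]}"
    return sql
```

With view columns x1, x4, x8, and x9 assigned to C.c1, A.a4, C.c2, and C.c3 and a hypothetical key `id` on both tables, the result is `SELECT C.c1 AS x1, A.a4 AS x4, C.c2 AS x8, C.c3 AS x9 FROM C JOIN A ON C.id = A.id`; a real table that backs none of the requested columns never appears in the statement.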

The operation for generating the joined table according to this embodiment with the above arrangement will be described in detail below with reference to FIGS. 10 to 13. Assume that data to be retrieved pertains to columns x1, x4, x8, and x9 on view X upon searching a database which is the same as that shown in FIG. 13.

As shown in FIG. 10, column x1 on view X corresponds to columns a1, b1, and c1 on real tables A, B, and C, column x4 on view X to column a4 on real table A, column x8 on view X to columns b5 and c2 on real tables B and C, and column x9 on view X to column c3 on real table C. The join processing of this embodiment selects one real column corresponding to each of four columns x1, x4, x8, and x9 including data to be retrieved so as to minimize the number of joined tables upon search.

More specifically, there are a total of seven columns (links 1 to 7 indicated by the arrows) on the real tables A, B, and C that pertain to four columns x1, x4, x8, and x9 on view X, and four columns (links) are selected from these seven columns (links) by the following processing.

First, a real table having the largest number of links (the number of columns including data to be retrieved) is extracted from real tables A, B, and C. In the initial state before joining tables, real table A includes two columns a1 and a4, real table B includes two columns b1 and b5, and real table C includes three columns c1, c2, and c3. Hence, since real table C has the largest number of links (the number of columns to be selected), it is selected, as shown in FIG. 11.

Links 4, 6, and 7 tied to selected real table C are then selected, and links 1, 3, and 5 corresponding to columns a1, b1, and b5 that store the same data as columns c1 and c2 on real table C are deleted. In this way, columns x1, x8, and x9 on view X are handled as belonging to real table C alone. Note that link 2 which remains undeleted is carried over to the subsequent processing.

Subsequently, the processing for extracting a real table with the largest number of links (the number of columns to be selected for search) from remaining real tables A and B excluding selected real table C is executed again. In this case, since the number of links for real table A is 1, and the number of links for real table B is zero, real table A is selected, as shown in FIG. 12, and link 2 tied to real table A is selected. As a consequence, the number of unsolved links becomes zero, thus ending the processing.

As a result of the above processing, as shown in FIG. 13, real tables A and C are selected as those including columns x1, x4, x8, and x9 that store data to be retrieved from three real tables A, B, and C. Of seven links 1 to 7 tied to three real tables A, B, and C, four links 2, 4, 6, and 7 are selected, and remaining links 1, 3, and 5 are deleted. Then, a join condition is appended between two selected tables A and C. In this way, a new table is formed by four columns, i.e., column a4 on real table A, and columns c1, c2, and c3 on real table C. This table is the aforementioned joined table (virtual table).
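The repeated extraction and exclusion walked through in FIGS. 10 to 13 can be sketched as a greedy loop. The following is an illustrative sketch only: the function name, the dictionary representation of the links, and the alphabetical tie-break between tables with equal link counts are assumptions (the text does not say how ties are resolved).

```python
from typing import Dict, List, Tuple


def select_real_columns(
    links: Dict[str, Dict[str, str]],  # view column -> {real table: real column}
) -> Tuple[List[str], Dict[str, Tuple[str, str]]]:
    """Greedily pick real tables until every view column has one real column.

    Returns the selected tables in order of selection and, for each view
    column, the (real table, real column) pair that was kept.
    """
    unresolved = set(links)
    chosen_tables: List[str] = []
    assignment: Dict[str, Tuple[str, str]] = {}

    while unresolved:
        # Count, per real table, the links to still-unresolved view columns.
        counts: Dict[str, int] = {}
        for vcol in unresolved:
            for table in links[vcol]:
                counts[table] = counts.get(table, 0) + 1
        if not counts:
            raise ValueError(f"no real column for: {sorted(unresolved)}")

        # Extract the table with the most links (ties: alphabetical order).
        best = max(sorted(counts), key=lambda t: counts[t])
        chosen_tables.append(best)

        # Keep the links tied to that table; links from the same view columns
        # to other tables are thereby excluded from later rounds.
        for vcol in sorted(unresolved):
            if best in links[vcol]:
                assignment[vcol] = (best, links[vcol][best])
        unresolved -= set(assignment)

    return chosen_tables, assignment


# The seven links of FIG. 10, written as view column -> real columns.
LINKS = {
    "x1": {"A": "a1", "B": "b1", "C": "c1"},
    "x4": {"A": "a4"},
    "x8": {"B": "b5", "C": "c2"},
    "x9": {"C": "c3"},
}
```

Applied to `LINKS`, the first round counts two links for A, two for B, and three for C, so C is extracted and x1, x8, and x9 are resolved; the second round finds one remaining link for A and none for B, so A is extracted for x4 and B is never joined, in agreement with FIG. 13.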

In this embodiment, since high-grade SQL analysis is done as pre-processing of the so-called DBMS, only a join condition required for search can be produced, and the number of tables to be joined can be reduced. In this way, the memory area of the computer can be prevented from being wasted, and the generation time of a search signal (Query) and the search response time can be shortened.

In this embodiment, the number of real tables is three. However, the present invention is not limited to such specific number of tables. In general, as the number of real tables becomes larger, the join process becomes more complicated, and the load on the computer becomes very heavy. However, the database system of this embodiment is particularly effective when the number of real tables is large.

Each of the above embodiments is implemented by supplying a software program code that implements the function of the embodiment to a computer in an apparatus or system connected to various devices so as to operate the devices to implement the functions, and by operating the devices according to a program stored in the computer (CPU or MPU) of the system or apparatus. Alternatively, each embodiment may be implemented by hardware.

In this case, the software program code itself implements the functions of the above embodiment, and the program code itself and means for supplying the program code to the computer, e.g., a recording medium that stores the program code, constitute the present invention. As a recording medium that stores the program code, for example, a floppy disk, hard disk, optical disk, magnetooptical disk, CD-ROM, magnetic tape, nonvolatile memory card, ROM, and the like may be used.

The program code is also included in the present invention, when the functions of the above embodiment are implemented not only by executing the supplied program code by the computer but also in collaboration of the program code and an OS (operating system), another application program, or the like, which is running on the computer.

Furthermore, the present invention also includes a case wherein the supplied program code is stored in a memory mounted on a function extension board of the computer or a function extension unit connected to the computer, and a CPU or the like mounted on the function extension board or unit executes some or all of actual processes on the basis of the instruction of the program code to implement the functions of the embodiment.

As described in detail above, according to the present invention, since metadata that pertain to real data stored in one or more databases are collected and managed by a second server, and metadata that match a retrieval condition are extracted by search of the second server, even when a plurality of databases and first servers that manage these databases are present, all metadata that match the retrieval request can be extracted by search of the second server. For this reason, the user can obtain information of different databases from a single second server without requiring immediate connectivity to the distributed databases (first servers). That is, the user need not make any cumbersome operations for detecting and accessing all the distributed first servers one by one, and can easily find a database that stores desired real data.

According to another feature of the present invention, since the first server comprises means for converting a retrieval request for a database transferred from a user terminal into a format that matches the database to be accessed, the user need only create and issue a retrieval condition according to a standard format without taking notice of the sites of the individual distributed databases, thus integrating different types of distributed databases and facilitating user's search operation.

According to still another feature of the present invention, since the second server comprises a function of acquiring metadata when the first server has updated the metadata or at predetermined time intervals, the latest information at the time of search can be obtained as a retrieval result from the metadata collected by the second server, thus realizing accurate data search and flexibly coping with system changes. Since such updating of metadata is done automatically, the user need not understand the complex processes of registration and deletion of data constantly done by the first servers or connection/disconnection of the first server itself to the network, thus reducing the load on the user.

According to still another feature of the present invention, processing for extracting one table including columns that store data to be retrieved from a plurality of tables, and processing for excluding columns on the extracted table and those on other tables which store the same data contents as those on the extracted table from the columns to be extracted are repeated, and tables extracted in turn by the table extraction means are joined. Hence, columns which are located on different tables but store identical data contents are handled as those belonging to one table, and more tables than required can be prevented from being joined. Also, a table which includes no columns that store data to be retrieved can be excluded from the tables to be joined, and one joined table can be generated using a minimum required number of tables. Hence, since a search can be made within a relatively narrow breadth obtained by excluding unnecessary tables, a database system which can attain high-speed processing while reducing the required memory capacity can be provided.

According to still another feature of the present invention, metadata that pertain to joining of a plurality of tables are collected and managed, and table extraction is done based on the metadata. Hence, a plurality of tables can be consistently managed by collecting the metadata, and a long view beyond the physical limitations such as the number of columns of a database can be created.

1-15. (Cancelled)
16. A database system which searches a plurality of tables joined by a relational database, comprising: table extraction means for extracting one table including columns that store data to be retrieved from a plurality of tables; column exclusion means for excluding columns of the table extracted by said table extraction means and columns on other tables which store the same data contents as data contents of the columns on the extracted table from columns to be extracted in subsequent processing; and table joining means for joining the tables extracted in turn by said table extraction means when the processing of said table extraction means and the processing of said column exclusion means have been repeated until all the columns including data to be retrieved are analyzed.
17. A system according to claim 16, wherein said table extraction means extracts one table including a largest number of columns which store data to be retrieved from the plurality of tables.
18. A system according to claim 16, further comprising metadata management means for collecting and managing metadata which pertain to joining of the plurality of tables, and wherein said table extraction means extracts the table on the basis of the metadata stored in said metadata management means.
19. A system according to claim 16, further comprising retrieval means for retrieving objects in accordance with a retrieval key, and wherein data is retrieved from the tables which are extracted in turn and joined by said table extraction means.
20. A method of data retrieval from a database, comprising repeating processing for extracting a table and processing for excluding columns including identical data upon search by joining a plurality of tables by a relational database in such a manner that one table including columns that store data to be retrieved is extracted from the plurality of tables, columns which store the same data contents as data contents of columns on the extracted table of other tables are excluded, and another table is extracted from the remaining tables, and joining one or more tables extracted in turn.
21. A method according to claim 20, wherein upon extracting one table from the plurality of tables, one table including a largest number of columns that store data to be retrieved is extracted.
22. A method according to claim 20, wherein data is retrieved from the one or more joined tables.
23. A computer-readable recording medium recording a program for making a computer implement the functions of: extracting one table including a largest number of columns that store data to be retrieved from a plurality of tables upon search by joining a plurality of tables by a relational database; excluding columns of the extracted table and columns on other tables which store the same data contents as data contents of the columns on the extracted table from columns to be extracted in subsequent processing; and joining the tables extracted in turn when the processing of said two means have been repeated until all the columns including data to be retrieved are analyzed.
24. A medium according to claim 23, wherein said program makes the computer further implement the function of retrieving objects in accordance with a retrieval key from the tables extracted and joined by said table extraction means.